Prompt
- Glossary
- LLM Settings
- Structure
- Techniques
- Context Engineering
- Development
Glossary
- Prompt Engineering: process of carefully designing and optimizing instructions (prompts) to elicit the best possible output from generative AI models, especially Large Language Models (LLMs). By providing clear, specific, and well-structured prompts, you can guide the AI to generate relevant, accurate, and high-quality responses
- Prompt: input you provide to a generative AI model to request a specific output. It can be a simple question, a set of instructions, or even a creative writing example
- Large Language Model (LLM): AI model designed to understand and generate human-like text. LLMs are trained on vast amounts of data and can perform tasks like translation, summarization, and even creative writing
- Prompt Template: a pre-defined structure or format for a prompt that can be customized with specific details or variables to generate dynamic prompts (a minimal sketch follows this glossary)
- Prompt Tuning: process of fine-tuning pre-trained LLMs by adapting them to specific tasks or domains through prompt engineering, rather than traditional fine-tuning methods
- Prompt Injection: a security vulnerability where an attacker manipulates the input prompt to influence the AI model's behavior in unintended ways, potentially leading to unauthorized actions or disclosures
- Prompt Leakage: situation where sensitive information from the prompt is inadvertently included in the generated output, posing privacy or security risks
- Prompt Bias: tendency of an AI model to generate responses that reflect the biases present in its training data, leading to unfair or inaccurate outcomes
- Prompt Hallucination: when an AI model generates information that is not supported by the input prompt or its training data, leading to false or misleading outputs
- Prompt Testing: process of evaluating and validating prompts to ensure they produce the desired output, meet quality standards, and comply with ethical and regulatory requirements
- Prompt Optimization: continuous process of refining prompts to improve their performance, based on feedback, testing results, and changes in the AI model or its training data
- Context Window: max number of tokens the model can process at once, including input and output. Often a model-specific architectural limit
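A minimal sketch of the prompt template idea, using only the Python standard library; the template text and variable names are illustrative, not from any specific framework:

```python
# Minimal prompt-template sketch using only the standard library; the
# template text and variables are illustrative, not from any framework.
from string import Template

SUMMARIZE = Template(
    "You are a $role. Summarize the following text in $num_points bullet points:\n\n$text"
)

prompt = SUMMARIZE.substitute(
    role="technical editor",
    num_points=3,
    text="Large Language Models are trained on vast amounts of data ..",
)
print(prompt)
```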
LLM Settings
Category | Setting Parameter | Description | Low Value Use Cases | High Value Use Cases |
---|---|---|---|---|
Sampling | Temperature | controls the randomness or "creativity" of the output. Higher values lead to more diverse and imaginative responses, while lower values make the output more deterministic and focused | factual Q&A, summarization | story generation, poetry, brainstorming |
Sampling | Top-P (Nucleus Sampling) | selects tokens from the smallest possible set whose cumulative probability exceeds the top_p threshold. Works in conjunction with temperature to control diversity | precise answers | varied and imaginative text |
Sampling | Top-K Sampling | limits the token selection to the top k most probable tokens at each step. The model will only consider words within this set. Often used in conjunction with Top-P | a low k limits token selection to few options for more focused output | a high k expands token options for greater diversity and creativity, but may include less relevant choices |
Advanced Sampling | Logit Bias | allows you to modify the probability of specific tokens appearing or not appearing in the generated output. You can increase or decrease the likelihood of certain words | reduces the likelihood of tokens with negative bias, prompting the model to avoid specific words | increases the likelihood of tokens with positive bias, encouraging the model to include specific words or phrases |
Output Control | Max Length / Max Tokens | sets the maximum number of tokens the model will generate in its response. In some APIs this limit covers both the input prompt and the generated output | summarization, quick answers: concise, cost-effective responses, cutting off if necessary | essay generation, code generation, detailed explanations: more detailed responses, but manage to avoid irrelevance and high costs |
Output Control | Stop Sequences | string or list of strings that, when encountered in the generated output, stops the model from generating further tokens | stops generating text at specified sequences, ensuring structured outputs and preventing run-ons | continues generating until reaching max tokens or an end-of-text token |
Output Control | N (Number of Completions) | specifies how many independent completions (responses) the model should generate for a single prompt | produces one response, typical for direct answers | creates several distinct responses for selection or variation, potentially increasing cost |
Repetition Control | Frequency Penalty | applies a penalty to new tokens based on how many times that token has already appeared in the text (prompt + generated response) | allows repetition with less penalty, increasing the likelihood of repeated words or phrases | imposes a higher penalty on repetition, promoting new vocabulary and discouraging repeated tokens |
Repetition Control | Presence Penalty | imposes a uniform penalty on new tokens that have appeared in the text at least once, regardless of their frequency | reduces penalties on previously mentioned tokens to maintain focus on a specific topic | increases penalties on previously used tokens to encourage diverse and distinct ideas |
Reproducibility | Seed | setting a seed makes the model's output deterministic for a given set of parameters | a fixed seed guarantees consistent results for repeated calls with the same prompt and settings, aiding debugging and reproducibility | without a seed, each call with the same prompt and settings yields a different output, while still adhering to other parameters |
Input Processing | Context Window (Max Context Length) | maximum number of tokens (input prompt + generated output) that the model can process and consider at one time. This is often a model-specific architectural limit | short prompts limit the model's memory of prior conversation, causing context loss in longer interactions | long conversations and large document analysis allow the model to maintain context, enhancing coherence and relevance in extended interactions |
Model Selection | Model Name/ID | specifies the particular LLM variant or version to be used. Different models have varying capabilities, sizes, and training data | smaller models may produce lower quality, less nuanced responses and have limited capabilities | larger models generally provide higher quality, more nuanced responses, but may incur higher costs and slower inference |
Generation Strategy | Decoding Type | refers to the algorithm used to select the next token. Common types include greedy decoding, beam search, and sampling (which involves temperature, top-p, top-k) | "greedy" selection yields deterministic but potentially less creative output by always choosing the highest-probability token | "sampling" adds variability, while "beam search" explores multiple sequences to identify more globally optimal output |
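A sketch of how these settings map onto an OpenAI-style chat completion call; parameter names and availability vary by provider, and all values here are illustrative:

```python
# Sketch: mapping the settings above onto an OpenAI-style chat completion call.
# Parameter names and availability vary by provider (e.g., top_k is offered by
# Anthropic's API but not OpenAI's); all values here are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",      # Model Name/ID
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    temperature=0.2,          # Sampling: low for focused, factual output
    top_p=0.9,                # Nucleus sampling threshold
    max_tokens=256,           # Output Control: cap on generated tokens
    stop=["\n\n"],            # Stop Sequences
    n=1,                      # Number of Completions
    frequency_penalty=0.5,    # Repetition Control: discourage repeated tokens
    presence_penalty=0.0,     # Repetition Control: uniform penalty per token
    seed=42,                  # Reproducibility (best-effort determinism)
)
print(response.choices[0].message.content)
```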
Structure
Aspect | Definition | Example |
---|---|---|
Task Context | briefly describe the overall task or objective to provide context for the model | You will be acting as an AI career coach named Joe created by the company AdAstra Careers. Your goal is to give career advice to users. You will be replying to users who are on the AdAstra site and who will be confused if you don't respond in the character of Joe |
Tone Context | specify the desired tone or style (e.g., formal, casual, technical, humorous) | You should maintain a friendly customer service tone |
Background data, documents, and images | provide any relevant background information, documents, or images that can help the model understand the context | Here is the career guidance document you should reference when answering the user: <guide>{{DOCUMENT}}</guide> |
Detailed task description & rules | outline the specific requirements, constraints, and rules for the task | Here are some important rules for the interaction: .. |
Examples | include examples that illustrate the desired output or behavior | Here is an example of how to respond in a standard interaction: .. |
Conversation history | provide context from previous interactions that may be relevant to the current task | Here is the conversation history (between the user and you) prior to the question. It could be empty if there is no history: <history> {{HISTORY}} </history> Here is the user's question: <question> {{QUESTION}} </question> |
Immediate task description or request | clearly state the task or question at hand | How do you respond to the user's question? |
Thinking step by step / take a deep breath | encourage a methodical approach to problem-solving | Think about your answer first before you respond |
Output formatting | specify any required formatting for the response | Put your response in <response></response> tags |
Prefilled response (if any) | begin the assistant's reply yourself so the model continues from it, steering format and voice | <response> |
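A sketch assembling the components above into a single prompt string; build_prompt is a hypothetical helper, and the section texts echo the table's examples:

```python
# Sketch: assembling the structural components above into one prompt string.
# build_prompt is a hypothetical helper; the section texts echo the table's examples.
def build_prompt(document: str, history: str, question: str) -> str:
    return "\n\n".join([
        # Task context + tone context
        "You will be acting as an AI career coach named Joe created by the "
        "company AdAstra Careers. Maintain a friendly customer service tone.",
        # Background data, documents, and images
        f"Here is the career guidance document you should reference:\n<guide>{document}</guide>",
        # Conversation history
        f"Here is the conversation history (may be empty):\n<history>{history}</history>",
        # Immediate task description or request
        f"Here is the user's question:\n<question>{question}</question>",
        # Step-by-step nudge + output formatting
        "Think about your answer first before you respond. "
        "Put your response in <response></response> tags.",
    ])

print(build_prompt(document="{{DOCUMENT}}", history="", question="How do I switch careers?"))
```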
Techniques
Technique | Definition | Explanation | Example |
---|---|---|---|
Zero-Shot Prompting | Perform tasks without examples | AI leverages pre-trained knowledge to handle new tasks without specific examples | Classify this text as positive or negative sentiment |
One-Shot Prompting | Learn from single example | Provide one example to guide AI's understanding of the desired task format | Translate 'hello' to French: 'bonjour' |
Few-Shot Learning | Learn from multiple examples | Supply 2-5 examples to demonstrate task patterns and expected outputs | 1+1=2, 2+2=4, 3+3=6. What is 4+4? |
Chain of Thought (CoT) | Step-by-step reasoning | Guide AI to break down complex problems into logical intermediate steps | If I have 3 apples and buy 2 more, then give away 1, how many do I have left? Let's think step by step.. |
Zero-Shot Chain of Thought (CoT) | CoT without examples | Use simple instruction like "Let's think step by step" to trigger reasoning | Calculate 15*7. Let's think step by step: |
Multimodal Chain of Thought (CoT) | CoT with multiple data types | Combine text, images, and other modalities in reasoning process | Analyze this image and describe the step-by-step process shown |
Auto Chain of Thought (CoT) | Automated CoT generation | Automatically generate reasoning chains through clustering and pattern recognition | AI generates its own step-by-step reasoning paths |
Constrained Generation | Limit output format | Restrict AI responses to specific formats, lengths, or structures | List exactly 5 items in bullet points, no more than 10 words each |
Contextual Prompts (RAG) | Use external context | Incorporate relevant external information to ground responses | Based on the provided company policy document, answer.. |
Effectiveness Evaluation | Measure prompt quality | Assess how well prompts achieve desired outcomes using specific metrics | Compare response quality across different prompt variations |
Ethical Considerations | Ensure responsible AI use | Design prompts that avoid bias, misinformation, and harmful content | Include fairness constraints and content safety guidelines |
Handling Ambiguity | Clarify unclear requests | Add specific constraints and context to reduce interpretation ambiguity | Summarize in exactly 3 bullet points under 50 words total |
Instruction Engineering | Craft clear directives | Write precise, unambiguous instructions for desired AI behavior | You are a technical writer. Explain quantum computing in simple terms |
Length Management | Control response size | Specify exact length requirements or use tokens to manage output | Provide a response between 100-200 words |
Meta-Prompting | Use AI to improve prompts | Employ one AI model to generate or optimize prompts for another | Improve this prompt to get better results: [original prompt] |
Multilingual Prompting | Handle multiple languages | Specify target language and cultural context for responses | Respond in Spanish using formal tone and Mexican cultural references |
Negative Prompting | Specify what to avoid | Explicitly state what the AI should not include in responses | Explain quantum physics without using mathematics or formulas |
Prompt Chaining | Link multiple prompts | Connect outputs of one prompt as inputs to subsequent prompts | Step 1: Analyze requirements. Step 2: Generate code based on analysis |
Prompt Formatting | Structure prompt layout | Use markdown, sections, and clear organization for better parsing | Format response as: ## Summary\n## Key Points\n## Conclusion |
Prompt Optimization | Refine prompts iteratively | Systematically improve prompts through testing and feedback loops | A/B test different prompt versions and measure performance |
Prompt Security | Prevent injection attacks | Design prompts resistant to malicious input manipulation | Validate and sanitize user inputs before processing |
Prompt Templates | Reusable prompt structures | Create parameterized templates for consistent, repeatable prompting | Generate [type] about [topic] for [audience] in [style] |
ReAct (Reason + Act) | Combine reasoning and actions | Alternate between thinking through problems and taking actions | Thought: I need to search for information. Action: Search [query] |
Rephrase and Respond (RaR) | Clarify before answering | Ask AI to rephrase questions for better understanding before responding | First rephrase this question, then provide your answer |
Role Prompting | Assign specific personas | Instruct AI to respond as particular characters or professionals | You are a senior software architect with 20 years of experience |
Style Prompting | Control output style | Guide the AI in adopting specific tones, formats, or structures | Respond in a formal tone, using bullet points for clarity |
Explicit Instructions Prompting | Define clear, direct instructions | Provide explicit, unambiguous instructions for the AI to follow | "Summarize the following text in exactly three bullet points." |
Output Priming | Set expectations for output | Guide the AI on the desired format, style, or content of its responses | Respond with a summary in bullet points, no more than 10 words each |
Self-Consistency | Generate multiple solutions | Create several reasoning paths and select most consistent answer | Generate 5 different solutions and choose the most frequent |
Self-Critique & Refinement | AI evaluates own output | Have AI review and improve its own responses iteratively | Review your answer and identify any weaknesses or improvements |
Step-Back Prompting | Consider broader context | First ask about general principles before specific applications | What are the general principles of good UX design? Now apply them to.. |
System Prompting | Set behavioral guidelines | Define overarching rules and context for all interactions | You are a helpful assistant that always responds truthfully and safely |
Task Decomposition | Break complex tasks down | Divide large problems into smaller, manageable sub-tasks | Break this project into 5 specific, actionable steps |
Task-Specific Prompts | Tailor to particular tasks | Customize prompts for specific use cases or domains | Write a product description for an e-commerce website |
Tree of Thoughts (ToT) | Explore multiple reasoning branches | Create tree structure of thoughts with evaluation and backtracking | Explore 3 different approaches to solve this problem, evaluate each |
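A sketch of Self-Consistency from the table above; ask_model is a stand-in for any LLM call, and extract_answer assumes the model was told to end with "Answer: <value>":

```python
# Sketch of Self-Consistency: sample several reasoning paths at a higher
# temperature, then take a majority vote over the final answers.
import random
from collections import Counter

def ask_model(prompt: str, temperature: float = 0.8) -> str:
    # Stand-in for an LLM call; replace with your provider's API.
    return "step-by-step reasoning .. Answer: " + random.choice(["8", "8", "7"])

def extract_answer(completion: str) -> str:
    # Assumes the model was instructed to end with "Answer: <value>".
    return completion.rsplit("Answer:", 1)[-1].strip()

def self_consistent_answer(question: str, samples: int = 5) -> str:
    prompt = f"{question}\nThink step by step, then end with 'Answer: <value>'."
    answers = [extract_answer(ask_model(prompt)) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]  # most frequent answer wins

print(self_consistent_answer("What is 4+4?"))
```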
Context Engineering
Aspect | Prompt Engineering | Context Engineering |
---|---|---|
Definition | the process of designing and optimizing prompts to produce desired responses from AI models. It is a subset of Context Engineering | the practice of structuring and managing the information provided to AI models to enhance their understanding and performance on specific tasks |
Focus | focuses on what to say to the model at a moment in time | focuses on what the model knows when you say it - and why it should care |
Purpose | get a specific response from a prompt; usually one-off | make sure the model consistently performs well across sessions and tasks |
Mindset | crafting clear instructions | designing the entire flow and architecture of a model's thought process |
Scope | operates within a single input-output pair | handles everything the model sees - memory, history, tools, system prompts |
Repeatability | can be hit-or-miss and often needs manual tweaks | designed for consistency and reuse across many users and tasks |
Scalability | starts to fall apart when scaled - more users = more edge cases | built with scale in mind from the beginning |
Precision | relies heavily on wordsmithing to get things "just right" | focuses on delivering the right inputs at the right time, reducing the burden on the prompt itself |
Tools | prompt box | memory modules, RAG systems, API chaining, and more backend coordination |
Debugging | mostly rewording and guessing what went wrong | involves inspecting the full context window, memory slots, and token flow |
Use Cases | Copywriting variations; one-shot code generation | LLM Agents with memory; Customer support bots; Multi-turn flows |
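A sketch of the context-engineering side: deciding what the model sees before any prompt is sent; retrieve() and the character budget are illustrative stand-ins:

```python
# Sketch of context engineering: decide what the model sees before any
# prompt is sent. retrieve() and the character budget are stand-ins.
def retrieve(query: str, k: int = 3) -> list[str]:
    # Stand-in for a real RAG retriever (vector search, keyword index, etc.).
    return [f"[doc {i} relevant to: {query}]" for i in range(k)]

def build_context(system_prompt: str, history: list[str], query: str,
                  budget_chars: int = 8000) -> str:
    docs = retrieve(query)
    # Reserve budget for the fixed parts, then keep the most recent history
    # turns that still fit (real systems budget in tokens, not characters).
    remaining = budget_chars - len(system_prompt) - sum(map(len, docs)) - len(query)
    kept: list[str] = []
    for turn in reversed(history):
        if len(turn) > remaining:
            break
        kept.insert(0, turn)
        remaining -= len(turn)
    return "\n\n".join([system_prompt, *docs, *kept, query])

print(build_context("You are a support bot.", ["user: hi", "bot: hello"], "Where is my order?"))
```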
Spec-Driven Development
- Spec-Driven Development: an emerging methodology that treats detailed, structured specifications as living, executable documents describing the software's what and why, while AI-powered tools handle the how by translating intent into code, overcoming the limits of ad-hoc AI-assisted coding. Inspired by TDD and BDD, it adapts them for AI workflows, making specs the source of truth for implementation, validation, and iteration.
Key Principles
- Specifications as Living Artifacts: Specs are dynamic, version-controlled documents (e.g., Markdown) that evolve, serving as the "North Star" for AI agents and teams
- Separation of Intent and Implementation: Focus on "what" (user needs, outcomes) in specs and "how" (architecture, stack) in plans
- Clarity and Unambiguity: Use precise language, consistent terminology, and structures (e.g., structs, loops in plain English) to minimize AI misinterpretation
- Iterative Refinement with Checkpoints: Validate outputs at each phase; developers critique and refine to spot gaps
- Incorporation of Constraints Early: Bake in security, compliance, performance, and integrations from the start
- Human Oversight: AI handles execution, but humans steer, review, and evaluate for reasonableness
- Raising Abstraction Levels: Shift from imperative ("how") to declarative ("what") programming, echoing historical leaps like high-level languages
Workflow
Phase | Definition | Key Activities | Focus | Challenges | Outputs |
---|---|---|---|---|---|
Specify | capture high-level intent focusing on user needs and outcomes. Avoid technical details | provide prompts on "what" and "why"; AI generates detailed spec; developer reviews and refines | what/why | ambiguity, scope creep, vague user needs | living spec document (e.g., Markdown with user stories, journeys, criteria) |
Plan | add technical "how" elements like stack, architecture, constraints | share docs on standards, integrations; AI generates plans (possibly variations); review for alignment | how (high-level) | overlooking constraints | technical plan document, including alternatives and decisions |
Tasks | break down into small, isolated, actionable steps | AI decomposes spec and plan; tasks mimic TDD for AI, ensuring testability | breakdown | task granularity | task list (e.g., Markdown checklist) |
Implement | execute tasks with AI generating code | AI implements per task; run tests, linting; developer reviews changes incrementally | execution | context loss in AI | code, tests, validated builds |
Review & Iterate | verify overall alignment; update spec for changes | test app; lint spec for clarity; regenerate as needed | validation | misalignment with original intent | refined artifacts; deployed features |
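A toy sketch of the Tasks phase output: parsing a Markdown checklist into discrete steps an agent could implement one at a time; the "- [ ]" convention and the checklist content are assumptions, not a standard:

```python
# Toy sketch of the Tasks phase: parse a Markdown checklist into discrete,
# actionable steps an AI agent could implement one at a time.
# The "- [ ]" convention and the checklist content are assumptions.
import re

def load_tasks(markdown: str) -> list[dict]:
    tasks = []
    for line in markdown.splitlines():
        match = re.match(r"- \[( |x)\] (.+)", line.strip())
        if match:
            tasks.append({"done": match.group(1) == "x", "step": match.group(2)})
    return tasks

checklist = """\
- [x] Specify: draft user stories for the signup flow
- [ ] Plan: choose auth provider and data model
- [ ] Implement: generate signup endpoint with tests
"""
for task in load_tasks(checklist):
    print("DONE" if task["done"] else "TODO", task["step"])
```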